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ABSTRACT 

A  version  of  the  reduced  control  space  four-dimensional  variational  method  (R4DVAR)  of  data  assimi¬ 
lation  into  numerical  models  is  proposed.  In  contrast  to  the  conventional  4DVAR  schemes,  the  method  does 
not  require  development  of  the  tangent  linear  and  adjoint  codes  for  implementation.  The  proposed 
R4DVAR  technique  is  based  on  minimization  of  the  cost  function  in  a  sequence  of  low-dimensional  sub¬ 
spaces  of  the  control  space.  Performance  of  the  method  is  demonstrated  in  a  series  of  twin-data  assimilation 
experiments  into  a  nonlinear  quasigeostrophic  model  utilized  as  a  strong  constraint.  When  the  adjoint  code  is 
stable,  R4DVAR's  convergence  rate  is  comparable  to  that  of  the  standard  4DVAR  algorithm.  In  the 
presence  of  strong  instabilities  in  the  direct  model,  R4DVAR  works  better  than  4DVAR  whose  performance 
is  deteriorated  because  of  the  breakdown  of  the  tangent  linear  approximation.  Comparison  of  the  4DVAR 
and  R4DVAR  also  shows  that  R4DVAR  becomes  advantageous  when  observations  are  sparse  and  noisy. 


1.  Introduction 

In  the  past  two  decades,  the  methods  of  oceano¬ 
graphic  data  assimilation  into  numerical  models  have 
undergone  a  significant  progress  from  the  early  works  of 
Le  Dimet  and  Talagrand  (1986)  and  Thacker  (1988)  to 
solution  of  the  increasingly  complex  problems,  reflected 
in  a  series  of  monographs  by  Bennett  (1992),  Wunsch 
(1996),  Evensen  (2006),  and  Talagrand  and  Bouttier 
(2009),  among  others. 

Most  recently,  research  in  data  assimilation  has  an 
apparent  trend  toward  the  studies  of  the  ensemble- 
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based  sequential  techniques  (Evensen  2003;  Ott  et  al. 
2004;  Zupanski  2005;  Uzunoglu  et  al.  2007).  These 
methods  utilize  low-dimensional  ensembles  of  model 
states  to  approximate  propagation  of  error  covariances 
that  are  vital  for  improvement  of  practical  weather 
forecast.  At  the  same  time,  the  classic  strong  constraint 
four-dimensional  variational  data  assimilation  (4DVAR) 
methods  still  remain  an  important  tool  in  atmospheric  and 
oceanic  data  analysis  in  both  global  (Wenzel  et  al.  2001; 
Stammer  et  al.  2003;  Blessing  et  al.  2008)  and  regional 
(Zupanski  et  al.  2005;  Yaremchuk  2006;  Di  Lorenzo  et  al. 
2007)  applications.  The  strong  constraint  methods  are  of 
particular  importance  in  oceanography  where  the  data 
coverage  is  sparse  and  observations  are  less  accurate. 

With  the  ever-growing  complexity  and  resolution  of 
the  ocean  general  circulation  models  (OGCMs),  con¬ 
straining  them  by  4DVAR  methods  is  hampered  by  the 
following  difficulties: 

1)  High  computational  cost  of  4DVAR  optimization. 

OGCMs  have  the  typical  state  vector  dimension  of 
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106-107  whereas  the  number  of  independent  obser¬ 
vations  is  only  an  order  in  magnitude  smaller  and 
growing.  On  the  other  hand,  optimal  estimation  of 
an  ocean  state  with  classical  4DVAR  methods  re¬ 
quires  the  number  of  model  runs  [including  those 
of  the  tangent  linear  (TL)  and/or  adjoint  models] 
comparable  with  the  number  of  observations  or 
model  state  dimension,  that  is  computationally  pro¬ 
hibitive.  As  a  consequence,  applications  of  4DVAR 
methods  are  limited  to  finding  suboptimal  solutions, 
obtained  after  10-100  iterations  of  the  minimization 
procedure. 

2)  High  maintenance  cost  of  the  adjoint  and  tangent 
linear  codes.  A  significant  part  of  the  programming 
burden  related  to  the  maintenance  of  the  adjoint  and 
TL  codes  cannot  be  automated  at  the  present  state  of 
the  adjoint  compilers.  In  addition,  keeping  the  ad¬ 
joint  and  TL  codes  updated  in  parallel  with  the 
perpetually  upgraded  models  is  labor  intensive  and 
prone  to  human  errors. 

3)  Breakdown  of  the  tangent  linear  approximation  (TLA). 
In  the  presence  of  strong  physical  instabilities  of  the 
background  state  applicability  of  TLA  is  restricted 
to  relatively  short  time  intervals  (e.g.,  Oldenborgh 
et  al.  1999).  Furthermore,  the  TL  and  adjoint  codes 
of  the  community  OGCMs  never  represent  exact  TL 
or  adjoint  operators,  especially  when  model  physics 
contains  parameterized  discontinuities  (Zhu  et  al. 
2002). 

To  resolve  these  difficulties  the  focus  of  research  has 
recently  shifted  toward  the  development  of  the  reduced 
control  space  4DVAR  (R4DVAR)  methods.  As  a  few 
examples,  Robert  et  al.  (2005)  utilized  the  empirical 
orthogonal  (EOF)  analysis  of  the  model  trajectory  for 
the  definition  of  the  reduced  control  space  and  param¬ 
eterization  of  the  background  error  covariance.  Robert 
et  al.  (2006)  explored  preconditioning  of  the  incre¬ 
mental  4DVAR  assimilation  by  the  R4DVAR  method. 
Qiu  et  al.  (2007)  studied  a  possibility  to  use  an  ensemble 
of  randomly  perturbed  model  states  for  generation  of 
the  reduced  control  by  singular  value  decomposition. 
Another  strategy  studied  by  (Cao  et  al.  2007;  Daescu 
and  Navon  2007)  is  based  on  the  reduction  of  the  model 
itself  using  EOF  approach.  Although  the  latter  tech¬ 
nique  improves  computational  efficiency,  the  issue  of 
finding  an  optimal  low-dimensional  state  subspace  re¬ 
mains  an  open  question. 

This  paper  presents  a  version  of  the  reduced  control 
space  4DVAR  data  assimilation  method.  In  contrast  to 
previous  studies  (e.g.,  Robert  et  al.  2006;  Daescu  and 
Navon  2007;  Qiu  et  al.  2007;  Liu  et  al.  2008),  which 
utilize  a  fixed  EOF-generated  subspace  for  optimization, 


our  algorithm  employs  a  sequence  of  low-dimensional 
subspaces  that  are  iteratively  updated  in  the  process  of 
finding  a  minimum  of  the  cost  function. 

The  paper  is  organized  as  follows:  in  the  next  section 
we  first  present  linear  considerations,  underlying  the 
development  of  the  scheme  and  outline  the  algorithm. 
In  section  3  the  setup  of  the  twin-data  experiments  is 
described  and  the  issue  of  the  adjoint  code  instability  is 
considered.  In  section  4  we  present  the  results  of  the 
twin-data  experiments  and  compare  the  method  with 
the  standard  4DVAR  technique.  The  conclusions  are 
presented  in  section  5. 

2.  Linear  background 

In  a  linear  context,  a  4DVAR  iterative  procedure 
employs  the  adjoint  code  for  exact  multiplication  of  an 
arbitrary  vector  by  the  Hessian  matrix.  For  computa¬ 
tional  reasons,  however,  the  number  of  iterations  in 
practical  problems  is  often  limited  by  a  few  hundred. 
Therefore,  such  solutions  should  be  treated  as  optimi¬ 
zations  on  the  subspace  spanned  by  a  limited  number  of 
eigenvectors  of  the  Hessian  matrix. 

In  this  regard,  the  ability  to  retrieve  leading  eigen¬ 
vectors  of  the  Hessian  matrix  is  of  primary  importance 
for  practical  purposes.  In  this  section,  we  consider  a 
simplified  linear  variational  data  assimilation  problem 
controlled  by  the  initial  conditions,  employ  EOF  anal¬ 
ysis  of  the  model  solutions  to  obtain  low-dimensional 
approximations  to  the  Hessian  structure,  and  construct 
a  minimization  method,  which  does  not  require  the 
adjoint  algorithm. 

a.  EOF  analysis  of  model  solutions 

As  discussed  by  Farrell  and  Ioannou  (2001,  2006),  an 
optimal  reduction  of  a  dissipative  normal  dynamical 
system  can  be  represented  as  Galerkin  projection  of  the 
dynamics  onto  the  least  damped  modes.  In  this  study, 
we  consider  control  of  model  solutions  by  the  initial 
conditions.  Therefore  it  will  be  instructive  to  consider 
first  the  impact  of  initial  conditions  on  the  result  of  EOF 
analysis  of  the  corresponding  model  solution  in  the 
simplified  case  of  the  normal  dynamical  operator. 

Consider  a  model  of  oceanic  circulation  drx  =  Mx, 
where  x  is  the  vector  of  the  state  variables  and  M  is 
a  normal  differential  operator  with  a  full  set  of  ei¬ 
genfunctions  p.k  and  corresponding  eigenvalues  \k, 
which  satisfy  the  conditions  Re{A/£(  <  0  and  A  = 
minJImAj  >0.  If  M  is  time  independent,  a  model  so¬ 
lution  corresponding  to  the  initial  state  x°  is  x(f)  =  x° 
exp(Mf).  In  terms  of  p.k  the  solution  is  represented  by 
the  expansion  x(r)  =  Tk  ak  exp(A kt)pk,  where  ak  are  the 
projections  of  x°  on  the  eigenstates  of  M. 
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A  standard  technique  widely  used  for  initialization  of 
the  sequential  data  assimilation  schemes  (e.g.,  Robert 
et  al.  2005)  is  the  EOF  analysis  of  the  covariance  matrix 
C  generated  by  the  time  averaging  of  a  model  run  over  a 
sufficiently  long-time  interval  T.  When  written  in  terms 
of  the  covariance  matrix  is 


C  —  x(t)xT(t)  =  y 


X  aka *  exp[(Afc  +  \*)t\p,kp.f  dt 


o  k,l 


V  exp[(Afc+Af)r]-l  T 
‘  — (At  +  A?)r  ^  ’ 


k,l 


much  less  than  M,  a  background  model  state  x°  and  its 
inverse  error  covariance  B  are  utilized  for  regularization 
of  the  problem. 

Introducing  notations  Q  =  \/TZ  O  ,  d  =  y/R  dn  and 
A"  for  the  propagator  between  r  and  tn,  the  standard  for¬ 
mulation  of  the  4DVAR  assimilation  problem  (e.g., 
Bennett  1992)  can  be  written  down  as 


J=\ 


(X°~x°)1B(x0-x°) 


+  X  (9„ A"x°  -  d")T(gn A”x°  -  d") 


min. 

X° 


(2) 


where  the  T  symbol  denotes  transposition  and  the  as¬ 
terisk  stands  for  the  complex  conjugate.  If  the  averaging 
time  is  long  enough  (A  T  — >  °°),  the  off-diagonal  ele¬ 
ments  of  C  vanish  in  the  basis  and  its  largest  ei¬ 
genvalue  Am,  satisfying  the  condition  Re{A  }/A  — >  0, 
can  be  estimated  as 


2  exp(2Re{Am}T)  —  1 
2Re{Am}r 


K\2F(Re{\JT). 

(1) 


Minimization  of  J  is  can  be  reduced  to  solution  of  the 
normal  equation: 

^=(b+Xa'^,A'')x° 

-  (bx°+Xa,,t^5'!^  =0,  (3) 

which  can  be  rewritten  as  Hx°  =  b,  where  b  =  Bx°  + 
I  A "TOlKndn  and 


The  expression  in  (1)  shows  that  the  leading  eigenvalues 
of  C  are  the  squares  of  the  largest  p  components  of  x° 
corresponding  to  eigenstates  of  M  with  the  smallest 
damping  Re{A).  Since  dissipation  in  M  usually  selec¬ 
tively  damps  high-frequency  modes,  the  leading  eigen¬ 
vectors  of  C  tend  to  capture  the  largest  spatial  scales  of 
model  variability,  which  is  also  typical  for  the  results  of 
EOF  analyses  of  the  oceanic  data.  At  the  same  time,  the 
largest  spatial  scales  are  the  least  prone  to  parameteri¬ 
zation  errors  of  the  dissipative  processes  in  OGCMs. 
This  property  makes  the  EOF  decomposition  of  a 
model  solution  an  efficient  tool  for  selectively  retrieving 
those  components  of  x°  that  are  most  accurately  pro¬ 
jected  by  a  numerical  model  on  the  data  points  dis¬ 
tributed  over  a  given  time  interval. 

b.  Krylov  subspace  methods  and  the  large  least 
squares  problems 

Now  consider  a  problem  of  4DVAR  into  a  linear 
dynamical  system  d,x  =  Mx,  where  x  6  TZM  is  the  state 
vector  of  the  ocean  and  M  is  a  time-dependent  linear 
operator.  The  model  solutions  are  controlled  by  the 
initial  state  x°  =  x(?°).  Observations  dn  of  x  are  made  at 
times  tn,  n  =  0,  . . .  ,  N,  with  N  M.  We  adopt  a  linear 
model  for  observations  dn  =  Onx(tn )  +  s,  where  e  is  the 
spatially  uncorrelated  noise  with  the  inverse  covariance 
Rn  and  linear  operators  On  project  the  model  states  x(f'!) 
onto  the  observed  quantities  d".  Since  the  total  number 
of  observations  in  the  time  interval  [r,  tN]  is  usually 


H=^=B  +  XA',T^A"  (4) 

d(xu)  « 

is  the  Hessian  matrix.  Assuming  that  the  symmetric 
M  X  M  matrix  B  could  be  represented  as  B  =  QTQ,  it  is 
convenient  to  rewrite  the  minimization  problem  (2)  in  a 
symmetric  form: 

/=^(Sx°-d)T(Sx°-d),  (5) 

where 


Q 

-Qx°- 

s  = 

Qx  A 

,  d  = 

1 

.  < 

_ i 

.  dN  _ 

are  the  “square  root”  of  H  =  STS  and  the  normalized 
data  vector,  respectively. 

The  large  and  sparse  system  of  linear  equations  (3) 
could  be  solved  by  means  of  the  Krylov  subspace 
methods  that  form  an  orthogonal  basis  {e„,}  on  the  se¬ 
quence  H'”r0,  m  =  1,  . . .  ,  M,  where  r0  =  Hx$  —  b  is  an 
arbitrary  initial  residual  vector.  The  approximations  to 
the  solution  are  obtained  by  minimizing  the  residual 
over  the  Krylov  subspaces  JC"  spanned  by  jem[-  The 
well-known  generalized  minimum  residuals  (GMRES), 
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conjugate  gradient,  and  biconjugate  gradient  methods 
can  be  viewed  as  particular  applications  of  the  Krylov 
subspace  technique  (e.g.,  Saad  2003). 

As  is  seen  from  (4),  multiplication  of  the  arbitrary 
residual  vector  by  the  Hessian  matrix  requires  an  al¬ 
gorithm  for  multiplication  of  the  vector  by  A"1  (the 
adjoint  model).  The  latter  may  often  be  unavailable 
and/or  expensive  to  run  and  implement  (see  section  1). 
Therefore,  it  might  be  useful  to  explore  a  possibility  of 
constructing  an  “adjointless”  iterative  scheme  for  the 
minimization  problem  in  (2),  which  is  based  on  the 
Krylov  subspace  technique. 

In  this  regard,  inspection  of  (4)  shows  that  computa¬ 
tion  of  the  higher  powers  of  H  for  construction  of  the 
Krylov  subspaces  may  seem  to  be  redundant,  because, 
for  example,  HA/  contains  terms  with  powers  of  A  up  to 
2 NM  whereas  a  complete  basis  in  1ZM  could,  in  princi¬ 
ple,  be  built  on  the  sequences  {A  kTQlQk AM,  {&A*r}.  or 
{A/lr}  with  k  =  1 M  (assume,  for  a  moment,  that 
both  A  and  QjQk  have  the  full  rank).  This  observation 
gives  some  grounds  to  examine  numerical  data  assimi¬ 
lation  schemes,  which  do  not  require  multiplication  by 
At  in  the  construction  of  Krylov  subspaces. 

The  simplest  approach  is  to  build  JC™  on  the  powers 
of  A.  Although  this  method  does  not  explicitly  take  into 
account  the  structure  of  B  and  Qk,  it  tends  to  selectively 
extract  the  least  damped  components  of  r  (i.e.,  those 
components  that  are  most  accurately  projected  by  A1  on 
the  data;  section  2a).  This  property  could  be  achieved  by 
extracting  the  leading  eigenvectors  of  the  covariance 
matrix  C  =  I„(A"r)(A"r)  r  through  the  EOF  analysis  of 
its  dual  C *(/,  k)  =  (A'r,  Akt),  where  angular  brackets 
denote  inner  product  1ZM . 

The  second  adjointless  approach  is  to  build  JC™  on 
the  sequence  {QnA!'t}.  This  method  requires  a  definition 
of  the  operators  V  to  project  the  model  counterparts  of 
the  data  on  the  state  space  at  times  t'\  Inspection  of  (4) 
shows  that  specifying  Vn  =  A "TQj,  provides  a  “natural” 
scheme  JCk  =  span{AAT  QkQkAkr\.  This  method,  how¬ 
ever,  should  be  discarded,  as  it  contains  AT.  Mathe¬ 
matically,  there  is  a  considerable  freedom  in  defining 
the  projection  operators:  the  only  restriction  imposed 
on  V  by  the  requirement  of  convergence  of  the  Krylov 
scheme  to  the  solution  of  (2)  is  to  keep  the  rank  of  the 
modified  ST  intact  (Hayami  et  al.  2007).  This  can  be 
achieved,  for  example,  by  replacing  AT  by  A  in  the  ex¬ 
pression  for  ST.  Computationally,  such  a  replacement 
will  require  an  additional  model  run  (as  a  substitution  of 
the  adjoint)  for  the  generation  of  JC".  In  the  present 
study,  we  take  another  approach  and  utilize  the  back¬ 
ground  error  covariance  B  1  for  projection  by  setting 
Vn  =  (Pl'Dn)~1gl  with  Vj  =  [££,Qt].  This  choice  could 
be  supported  by  the  fact  that  the  covariance  propagates 


in  time  by  model  dynamics  and  thus  provides  a  measure 
for  the  distance  between  the  model  states  x(t)  in  terms 
of  their  value  at  t  =  0. 

One  can  expect  that  this  second  method  of  generating 
JCk  may  converge  faster,  because  it  takes  into  account 
the  structure  of  Qk  and  B  in  generating  the  Krylov 
spaces  and,  therefore,  may  give  better  approximations 
to  H  than  the  first  method. 

c.  Practical  implementation 

In  this  section  we  describe  a  4DVAR  assimilation 
method  based  on  successive  minimizations  of  the  cost 
function  performed  in  low-dimensional  Krylov  sub¬ 
spaces  JC™  spanned  by  projections  of  the  residuals  on 
the  approximations  to  the  leading  eigenmodes  of  H  and 
C  in  different  experiments.  The  proposed  technique 
exploits  low  computational  cost  of  both  EOF  analysis 
and  explicit  inversions  of  the  Hessian  operators  in  JC™. 

The  optimization  procedure  starts  with  a  first-guess 
state  Xq,  whose  time  evolution  is  subsampled  to  retrieve 
the  first  Krylov  space  JC™  via  EOF  decomposition  of  the 
corresponding  covariance  matrix  C.  The  corresponding 
projection  operator  V0  is  represented  by  the  M  X  m 
matrix,  whose  columns  are  the  eigenvectors  {//)}  of  C. 
The  dimension  m  of  JC™  is  somewhat  arbitrary,  but 
should  be  small  enough  to  reduce  the  computational  cost. 
Objectively,  m  can  be  chosen,  for  example,  as  the  number 
of  modes,  explaining  a  certain  portion  1  —  £  of  the  model 
variability  defined  by  the  observational  noise  level  £. 

After  specifying  the  first  Krylov  subspace,  the  gradi¬ 
ent  V07  in  JC™  is  computed  by  perturbing  Xo  with  j/r°} 
and  taking  the  finite  differences  of  J.  Simultaneously, 
we  obtain  the  projection  of  the  Hessian  operator  H0  = 
VoHVo  in  JCQ  using  the  approach  of  Zupanski  (2005). 
Next,  the  value  of  control  x1,1  after  the  first  iteration  is 
computed  as 

x?=x°-x°-x°-P0H0-1V0/.  (7) 

Note  that  in  the  case  of  linear  dynamics,  x?  corresponds 
to  the  exact  and  unique  minimum  of  J  in  JC™ .  In  the 
nonlinear  case  considered  in  the  next  section,  it  is  nec¬ 
essary  to  execute  several  iterations  of  this  “internal” 
optimization  loop  to  reach  a  local  minimum. 

The  second  “external”  iteration  starts  with  the  EOF 
analysis  of  xj(t),  which  generates  the  next  Krylov  sub¬ 
space  JC™.  Note  that  the  residual  control  x?  does  not 
contain  the  JC0  components  of  x°  (denoted  by  x§), 
which  already  explain  a  certain  portion  of  the  data.  To 
remove  them  from  JC™,  the  basis  in  JC™  X  JC™  is  or- 
thogonalized  using  the  Gram-Schmidt  process. 
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Next,  we  store  the  suboptimal  control  x°  =  Xq  and 
exclude  it  from  optimization  process  by  updating  the 
cost  function  in  (5)  and  minimizing  it  in  JC": 

J^J=l  [S(x°  +  x°)  -  d]T[S(x°  +  x°)  -  dl  ->  min  . 

/  vOc  'Ym 


Minimization  in  is  performed  by  the  procedure 
outlined  by  (7).  In  a  similar  manner  we  proceed  with 
further  iterations. 

In  general,  on  the  /th  iteration,  we  first  update  the  sub- 
optimal  control  x°  <—  x°  +  x°,  find  0Ci+1  through  the 
EOF  analysis  of  x,(f),  orthogonalize  the  basis  in 
X  OC™ .  update  the  cost  function  in  (8),  and  then 
minimize  it  in  to  obtain  the  next  contribution  x°+1 
to  the  suboptimal  control  vector  x°.  Numerically,  x?  are 
iteratively  accumulated  in  a  single  array  x°  and  do  not 
require  additional  memory  resources. 

d.  Comparison  with  GMRES 

The  proposed  algorithm  belongs  to  the  family  of  Krylov 
subspace  methods  widely  used  for  solution  of  the  large 
sparse  linear  systems  of  equations.  Therefore  it  is  instruc¬ 
tive  to  compare  it  with  the  well-known  limited-memory 
generalized  minimal  residual  (GMRES,,,)  scheme  (Saad 
2003).  When  applied  to  the  system  of  equations  Hx  =  b, 
the  GMRES,,,  algorithm  [m  =  sup(dim  JC)]  employs  the 
following  iterative  loop: 

1)  Given  the  approximate  solution  x  on  the  At  h  itera¬ 
tion,  generate  Krylov  spaces  spanned  by  {T>'[H]r},  i  = 
1,  . . .  ,  A,  where  P  '[H]  are  the  /th-order  polynomials 
in  H  and 

r  =  b  —  Hx  (9) 

is  the  residual.  The  polynomials  V'  are  generated 
sequentially  by  the  Arnoldi  process  (Arnoldi  1951) 
to  form  orthonormal  bases  in  OC . 

2)  Compute 

/  =  (H£-r)T(Hf-r)  ->  min  (10) 

feX' 

and  update  the  approximation  to  the  solution: 
x  <-  x  +  f . 

3)  If  the  updated  residual  is  small  enough,  exit;  other¬ 
wise,  go  to  step  1  and  either  increase  the  dimension 
of  the  Krylov  subspace  by  generating  the  next-order 
polynomial,  or  (if  i  =  m)  start  building  the  new  se¬ 
quence  of  3C  using  the  updated  residual. 

It  can  be  proved  that  if  m  =  M,  the  GMRES  algo¬ 
rithm  provides  the  exact  solution  (Saad  2003).  More¬ 


over,  the  Krylov  space  can  be  built  on  the  powers  of  any 
matrix,  whose  null  space  is  H-orthogonal  to  b. 

The  proposed  R4DVAR  algorithm  has  two  major 
differences  from  the  classic  GMRES,,,.  Both  of  them  are 
dictated  by  the  necessity  to  avoid  multiplication  by  AT. 
First,  the  Krylov  spaces  are  built  not  on  the  powers  of  H, 
but  on  the  sequences  of  the  operators  approximating  the 
entries  of  its  square  root  in  (6).  Second,  since  computa¬ 
tion  of  the  residual  in  (9)  requires  the  adjoint  code,  we 
adopt  an  alternative  expression  r  =  xQ  —  x  implicitly 
multiplying  (9)  by  H .  This  modification  does  not  affect 
the  convergence  properties  of  the  algorithm  as  soon  as 
the  first-guess  vector  x0  contains  all  the  spectral  compo¬ 
nents  of  H  that  are  needed  for  decomposition  of  b. 

Other  distinctions  from  the  classic  GMRES„,  scheme 
are  purely  technical: 

1)  Orthogonalization  of  the  basis  in  is  done  not  by 
the  Arnoldi  process,  but  via  EOF  analysis  of  the 
sample  covariance  matrix  built  on  the  Krylov  vectors 
{A'r)  or  {ViQj A'r}. 

2)  Minimization  in  the  Krylov  subspace  is  done  by  di¬ 
rect  computation  of  the  gradients  and  inversion  of 
the  Hessian  in  3Cm .  From  the  computational  point  of 
view  this  is  approximately  equivalent  to  the  GMRES 
minimization  scheme,  which  employs  the  Gramm 
matrix  generated  by  the  Arnoldi  process. 

3)  The  residual  cost  function  (10)  is  taken  in  its  original 
form  (8),  which  can  be  considered  as  a  square  root 
form  of  (10).  This  allows  us  to  estimate  the  cost 
function  by  summing  the  squares  of  the  residuals  in 
the  data  space  and  thus  avoid  utilization  of  the  adjoint 
code,  which  is  necessary  for  estimation  of  (10). 

From  the  mathematical  point  of  view  the  proposed 
algorithm  should  be  equivalent  to  full  (m  =  M)  GMRES 
under  two  conditions:  1)  the  rank  of  ST  is  kept  intact  by 
Pk;  and  2)  the  first-guess  vector  contains  all  the  spectral 
components  of  b.  These  conditions  do  not  seem  to  be  too 
restrictive  in  applications,  because  a  relatively  good  first- 
guess  approximation  is  often  available  in  the  form  of  the 
background  state  x°.  Note,  however,  that  in  contrast  to 
full  GMRES,  which  may  work  with  any  first -guess  vector, 
the  proposed  algorithm  relies  on  the  quality  of  x°. 

In  the  numerical  experiments  below  we  demonstrate 
the  algorithm’s  performance  in  a  typical  “oceanographic 
application”  when  neither  the  background  state  nor  its 
error  covariance  is  available.  In  such  situations  the 
“best”  first-guess  state  has  to  be  retrieved  from  the  data, 
the  background  state  is  taken  to  be  zero,  and  its  error 
covariance  is  modeled  by  a  low-pass  filter.  This  approach 
to  error  covariance  modeling  has  gained  considerable 
attention  in  recent  years  (e.g.,  Weaver  and  Courtier  2001; 
Pannekoucke  and  Massart  2008). 
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Fig.  1.  Streamfunction  i jj  of  the  reference  solutions  with  (top)  v  =  50  m2  s  1  and  (bottom)  v  =  500  m2  s  Asterisks  denote  the  data 
points  of  the  sampling  grids  used  in  twin-data  experiments.  The  contour  interval  is  3000  m2  s-1. 


3.  Twin-data  experiments 

To  assess  the  performance  of  the  method,  we  con¬ 
ducted  twin-data  experiments  with  a  nonlinear  model 
controlled  by  the  initial  conditions.  The  underlying  idea 
is  to  generate  reference  model  solutions,  sample  them 
using  a  simulated  “observational  array,”  contaminate 
the  samples  by  noise,  and  then  reconstruct  the  reference 
solution  from  these  samples  using  the  assimilation 
method  under  study  (hereafter  R4DVAR). 

A  nonlinear  model  is  chosen  for  two  reasons:  1)  it  is 
more  realistic  than  the  linear  one  in  the  sense  of  ap¬ 
plications,  and  2)  it  has  an  ability  to  generate  reference 
solutions  with  unstable  tangent  linear  and  adjoint  models. 
The  latter  situation  is  quite  common  in  practice,  and  it 
was  interesting  to  compare  R4DVAR  with  the  standard 
4DVAR  in  that  case. 

a.  Numerical  model 

We  consider  a  quasigeostrophic  model  in  a  square 
33  X  33  grid  Cl  with  a  spatial  resolution  of  8x  =  15  km 
(Fig.  1): 

7  1 

dtq  +  /((//,  Ai//)  +  /35ri/f=  vA-i/t+  -curl,T,  (11) 

-  RjV  =  <7,  (12) 


where  i//  is  the  streamfunction  in  the  upper  layer,  /j  is 
the  meridional  gradient  of  the  Coriolis  parameter,  Rj  is 
the  internal  Rossby  radius  of  deformation,  and  v  is  the 
horizontal  diffusion  coefficient. 

At  the  spinup  stage  the  model  is  forced  for  1000  days 
by  a  steady  wind  stress  curl  pattern: 


curl  r  = 

Z 


_o 

L 


sin 


cos 


Here  t0  =  5  X  10  5  m2  s  2.  L  =  480  km  is  the  horizontal 
size  of  the  domain,  and  x*,  y*  are  Cartesian  coordinates 
rotated  40°  with  respect  to  the  north-south  direction. 
The  other  model  parameters  are  h  =  700  m,  ji  = 
2  X  1 0  m  1  s  ',  and  R,/  =  25  km.  The  horizontal 
diffusion  coefficient  v  was  either  50  or  500  nr  s  1  in 
different  experiments. 

Since  the  boundary  conditions  are  assumed  to  be 
known  (i//|an  =  Ai/f|aa  =  0),  the  model  is  controlled  by  the 
initial  distribution  of  the  potential  vorticity  field  q(x,  y,  0). 
Equations  (11)  and  (12)  are  integrated  in  time  using  the 
leapfrog  scheme  with  a  time  step  of  0.05  days.  Therefore, 
the  number  of  adjusted  parameters  M  (gridpoint  values 
of  q  at  t  =  1000  and  t  =  1000.05  days)  is  2  X  312  =  1922. 

After  the  spinup  the  wind  was  switched  off  and  the 
model  was  run  in  an  unforced  regime  for  T  =  45  days, 
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producing  the  reference  solutions  ipIcC  (Fig.  1)  for  the 
twin-data  experiments. 

b.  Instability  of  the  tangent  linear  model 

The  reference  solution  with  v  =  50  m2  s_1  is  charac¬ 
terized  by  a  high  degree  of  nonlinearity  (the  typical 
value  of  | /(*//,  <w)|  exceeds  dissipation  and  /3-effect  terms 
by  an  order  in  magnitude).  As  a  consequence,  TL  and 
adjoint  codes  turn  to  be  unstable.  This  instability  has  an 
e-folding  time  scale  re  of  approximately  15  days,  that  is 
several  times  less  than  the  typical  dissipation  time  Tlt  = 
8xzlv  of  the  grid-scale  harmonics.  This  property  of  the 
model  significantly  degrades  the  performance  of  the 
traditional  4DVAR  scheme,  based  on  the  adjoint  code. 
We  tested  the  validity  of  TLA  by  perturbing  q  at  t  =  0 
with  a  test  function: 

8q(x,y)  =  e sin^l3Tr^  sin^llrr^ 
and  estimated  the  quantity: 


Fig.  2.  TLA  errors  <f>(e)  of  the  model  solutions. 


<F(e) 


*q+Sq  ~  ~ 

IWI 


\cp\  dlldt, 
n 


where  ijt  is  the  streamfunction  produced  by  integrating 
the  model  from  initial  conditions  specified  by  the  sub¬ 
script  and  if/1  denotes  the  solution  of  the  tangent  model 
linearized  in  the  vicinity  of  ijjq.  As  shown  in  Fig.  2,  the 
correct  asymptotic  behavior  of  the  Taylor  expansion 
<F(e)  ~  e2,  (e  — >  0)  is  observed  only  with  v  =  500  nr  s  1 
(i.e.,  when  the  dissipation  time  scale  is  comparable  with  r,,). 

Exponential  growth  of  the  small-scale  harmonics  in 
the  tangent  linear  and  adjoint  codes  could  be  sup¬ 
pressed  by  increasing  the  viscosity  to  a  “stable”  value 
(v  =  500  m2  s  1 :  B.  Cornuelle  2006,  personal  commu¬ 
nication).  This  approach  usually  improves  the  perfor¬ 
mance  of  the  adjoint  4DVAR  schemes  in  the  cases 
when  TLA  breaks  down  because  of  nonlinear  instabil¬ 
ities.  However,  it  does  not  improve  the  accuracy  of  TLA 
and  may  degrade  it  even  further  (Fig.  2).  As  a  rule,  TLA 
breakdown  causes  certain  difficulties  in  the  performance 
of  descent  algorithms,  making  the  adjoint  4DVAR  in¬ 
efficient  after  several  iterations. 


c.  Simulated  observations 

Observations  ifj£n  were  picked  from  the  first-guess 
solutions  at  /„=  ]  2,3  =  15,  30,  and  45  days  at  the  spatial 
locations  specified  by  “sparse”  and  “dense”  measure¬ 
ment  arrays  (Fig.  1)  and  contaminated  by  white  noise 
e^,  whose  rms  variation  e||i/frcl||/(flT)  was  varied  (e  = 
0,  0.1,  0.3).  To  regularize  the  problem  we  also  specified 


the  “bogus  data  set”  A2i//  =  0  at  tn  =  0,  15,  30,  and 
45  days.  The  corresponding  cost  function  is 


3  K 


J  =  - 


n=lk=\ 


3 


+wsX[A2-KO]2 

n=0 


dti, 


where  Ok  projects  i//(/„)  on  the  Arth  observation  point, 
K  =  16(64)  is  the  number  of  observation  points  at  time 
layer  n,  and  Ws  =  0.035x4  is  the  smoothing  weight.  The 
dimension  of  the  observational  space  (including  both 
real  and  bogus  datapoints)  is  na  =  312  X  4  +  3 N  = 
3  N  +  3844. 

Following  the  notation  of  section  2b,  the  observa¬ 
tional  operator  for  n  =  1, 2, 3  is  represented  in  the  matrix 
form: 


Q  = 


'  K 

So* 

k= 1 

VW  A2 


-2crl 


[A-Rrf-2E 


where  E  is  the  identity  matrix.  For  n  =  0  the  observa¬ 
tional  operator  is  Q  =  \/WiA2(A  —  The  cor¬ 

responding  term  of  the  cost  function  can  be  interpreted 
as  the  background  term  with  <7*  =  0  and  the  inverse 
background  error  covariance  B  =  QTQ. 

We  performed  two  sets  of  the  twin-data  assimilation 
experiments:  with  the  stable  ( v  =  500  m2  s_1)  and  un¬ 
stable  ( v  =  50  m2  s_1)  adjoint  models.  Within  each  set 
we  varied  the  number  of  observations  N  =  {16,  64}  and 
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t=0  days 


t=0  days 


the  noise  level  in  the  data  e  =  {0,  0.1,  0.3}.  In  the 
R4DVAR  assimilation  experiments  we  also  varied  the 
dimension  of  3Cm  m  =  {8,  15}  and  the  type  of  EOF 
decomposition  in  the  process  of  generation  of  the 
Krylov  spaces  (section  2b).  In  the  R4DVAR  analysis 
the  value  of  m  is  limited  by  the  number  of  the  time 
samples  N.  To  bypass  this  limitation  the  model  fields 
sampled  at  t n  were  augmented  with  additional  samples 
taken  daily  between  observations. 

To  assess  the  R4DVAR  performance  we  checked  its 
convergence  rate  against  4DVAR  for  every  set  of  pa¬ 
rameters.  In  the  unstable  case  (v  =  50  m2  s_1)  the  ad¬ 
joint  model  was  stabilized  by  setting  v  =  500  m2  s 
otherwise  it  was  impossible  to  find  a  minimum  of  J  with 
a  reasonable  accuracy. 

The  quality  of  reconstruction  of  the  reference  solu¬ 
tions  was  gauged  by  the  error  parameter: 

2_II*A--Are£ll 

*  ll^ll  ‘ 

d.  First-guess  solutions 

To  obtain  the  first-guess  set  of  basis  functions  in  the 
R4DVAR  case  we  used  the  following  procedure:  first, 
the  values  of  »  =  1 , ....  3  were  linearly  interpolated 

on  the  model  grid;  second,  the  interpolated  fields  i[>k(x,  y) 
were  smoothed  by  a  biharmonic  filter,  whose  transfer 
function  was  tuned  to  suppress  the  interpolation  noise 
and  noise  in  the  data;  third,  potential  vorticity  distri¬ 
butions  qk(x,  y)  were  computed  as  =  (A  —  R^2)i//k, 
forth,  q  i  (x,  y)  and  <r/2(x,  y)  were  integrated  for  30  and 
15  days,  respectively,  and  subsampled  with  3-day  dis¬ 
cretization;  finally,  q3(j <c,  y)  was  added  to  17  samples 


obtained  from  the  integrations  and  the  whole  set  of 
18  fields  was  subjected  to  EOF  analysis. 

Figure  3  shows  comparison  of  the  qIcC,  q\  and  the 
spectra  EOF(<7rer),  EOF(^.  i8)  of  the  first-guess  sam¬ 
ples  retrieved  from  the  “dense”  data  with  e  =  0, 0.1,  and 
0.3  in  the  unstable  case.  It  is  seen  that  the  reference 
solution  can  be  described  with  a  relatively  high  preci¬ 
sion  using  only  15-20  eigenmodes  of  the  covariance 
function  qret(x,y,  t)qTef(x'  ,y' ,  t),  whereas  the  first-guess 
spectra  differ  considerably  in  their  properties  from  the 
reference  one  (right  panel  in  Fig.  3).  The  difference  is 
caused  by  the  dominance  of  the  large-scale  modes  and 
exhibits  itself  in  much  steeper  decay  of  the  spectra.  As  a 
consequence,  the  first-guess  solutions  are  well  approxi¬ 
mated  by  only  five-seven  eigenmodes  of  the  respective 
covariance  functions.  The  corresponding  values  of  e. , 
however,  are  rather  large  and  vary  in  between  0.42  for 
e  =  0  and  0.51  for  the  e  =  0.3. 

4.  Results 

We  conducted  a  series  of  60  twin-data  assimilation 
experiments  using  the  adjoint  and  R4DVAR  assimila¬ 
tion  techniques.  The  major  purpose  was  to  compare  the 
convergence  rates  of  both  methods  and  estimate  their 
potential  ability  to  retrieve  the  reference  state  from 
the  data.  Results  of  the  experiments  are  assembled  in 
Tables  1  and  2. 

In  the  R4DVAR  analyses  we  also  compared  the  per¬ 
formance  of  the  two  methods  of  generating  the  Krylov 
subspaces  outlined  in  section  2b.  The  first  one  is  based 
on  EOF  decomposition  of  the  sample  covariance  matrix 
C  (experiments  labeled  F8’15),  whereas  the  second  one 
also  accounts  for  the  background  covariance  and  the 
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TABLE  1.  Results  of  the  assimilation  experiments  with  the  stable  adjoint  model  (v  =  500  m2  s 

-1). 

sif/ 

Sx 

Adjoint 

178 

£// 

Es 

F15 

£// 

E15 

eiff 

JIJo  (X103) 

e>f/ 

JUa  (X103) 

exff 

JIJo  (X103) 

e\ff 

JIJo  (X103) 

JIJo  (X103) 

0.0 

4 

0.034 

0.44 

0.040 

1.25 

0.037 

1.32 

0.040 

1.07 

0.038 

1.08 

8 

0.196 

3.61 

0.156 

3.74 

0.193 

4.97 

0.122 

2.62 

0.116 

2.75 

0.1 

4 

0.086 

8.52 

0.103 

8.74 

0.107 

10.1 

0.059 

7.13 

0.058 

7.21 

8 

0.192 

11.8 

0.143 

5.89 

0.174 

7.63 

0.139 

5.65 

0.144 

5.79 

0.3 

4 

0.181 

60.1 

0.145 

56.2 

0.164 

57.8 

0.136 

53.8 

0.136 

53.9 

8 

0.257 

52.1 

0.271 

38.9 

0.280 

39.9 

0.297 

33.4 

0.308 

35.6 

structure  of  the  observation  operators  (experiments 
Eh15).  Because  4DVAR  method  failed  to  converge  in 
the  unstable  case  because  of  the  breakdown  of  the  TLN 
approximation,  we  prescribed  v  =  500  m2  s  1  in  the  ad¬ 
joint  model  for  both  stable  and  unstable  runs.  In  4DVAR 
experiments  the  limited-memory  quasi-Newtonian  de¬ 
scent  algorithm  of  Byrd  et  al.  (1995)  was  used.  Iterations 
were  terminated  when  either  the  relative  reduction  of 
the  cost  function  was  less  than  machine  precision  1(U10, 
or  the  number  of  iterations  exceeded  3000.  In  the 
R4DVAR  experiments  we  performed  several  H  1  pre¬ 
conditioned  iterations  [Eq.  (7)].  As  a  rule,  convergence 
was  fast,  and  never  required  more  than  three  iterations 
of  the  internal  minimization  loop. 

After  some  tuning  we  found  that  the  best  overall 
convergence  rate  was  achieved  when  the  control  sub¬ 
space  was  updated  as  soon  as  the  gradient  in  the  inner 
loop  reduced  more  than  50  times  in  magnitude.  This 
criterion  was  used  in  to  compute  the  values  listed  in 
Tables  1  and  2,  which  compare  e^,  and  the  relative  re¬ 
duction  of  the  cost  function  JIJ0  between  the  assimila¬ 
tion  experiments.  As  seen  from  the  tables,  R4DVAR 
outperforms  4DVAR,  in  most  cases  providing  better  fit 
to  the  reference  state  and  greater  reduction  of  the  cost 
function.  The  only  exception  is  the  case  of  dense  ob¬ 
servational  array  with  perfect  observations  (first  line  in 
Table  1). 

Comparing  columns  2-4  and  3-5  in  both  tables  also 
shows  that  performance  of  R4DVAR  is  better  for  m  =  15 
eigenfunctions.  Experiments  with  larger  m  (not  shown) 
did  not  improve  the  rate  of  convergence.  We  attribute 


this  to  the  spectral  properties  of  the  reference  solution, 
which  show  that  qlcl  can  be  approximated  by  15  eigen- 
modes  with  an  error  of  7%  (dashed  line  in  Fig.  3c). 

Differences  in  the  values  of  e,/,  and  JIJ0  in  columns  2-3 
and  4-5  indicate  that  Eh  experiments  provide  a  some¬ 
what  better  convergence  rate:  the  final  values  oiJIJ0  are 
lower,  when  compared  with  those  obtained  using  EOF 
analysis  of  C.  The  advantage  is  particularly  evident  for 
e  =  0  and  0.1  with  m  =  15  in  the  unstable  case  and  m  =  8 
in  the  stable  case.  In  terms  of  e,,,  the  difference  in  per¬ 
formance  between  the  methods  is  less  evident,  espe¬ 
cially  for  sparse  observations  and  e  =  0.3.  This  can  be 
partly  explained  by  a  tendency  to  data  overfitting  by  the 
Eh  method,  which  tends  to  converge  faster,  whereas 
Ws  was  not  fine  tuned  to  adequately  account  for  the 
noise  level. 

Figures  4  and  5  compare  convergence  rates  of  the 
R4DVAR  technique  and  the  4DVAR  method.  To 
simplify  the  comparison,  the  number  of  4DVAR  itera¬ 
tions  is  shown  below  the  horizontal  axis  whereas  the 
equivalent  (in  CPU  terms)  number  of  R4DVAR  inner 
loop  iterations  is  shown  above  the  axis.  Since  the  value 
of  m  is  low,  the  CPU  time  required  by  EOF  analysis  and 
covariance  estimation  contributes  only  a  small  fraction 
to  the  total  computational  cost  of  the  R4DVAR,  which 
is  almost  entirely  determined  by  the  number  of  model 
runs  ( m  +  1)  required  for  gradient  estimation.  The  ad¬ 
joint  code  in  our  case  required  10%  more  CPU  time 
than  the  direct  run,  so  that  one  R4DVAR  iteration  was 
approximately  equivalent  to  5  4DVAR  iterations  for 
m  =  8  and  8  iterations  for  m  —  15. 


Table  2.  As  in  Table  1,  but  for  the  unstable  adjoint  model  (v  =  50  m2  s  *). 


Adjoint  Eh  £8  Eh  E 15 


£i// 

Sx 

fl 

JIJo  (XlO3) 

JIJo  (XlO3) 

e* 

JIJo  (XlO3) 

(1 

JIJo  (XlO3) 

ei// 

JIJo  (XlO3) 

0.0 

4 

0.182 

41.1 

0.099 

11.9 

0.106 

12.1 

0.089 

10.1 

0.095 

10.5 

8 

0.391 

56.3 

0.341 

29.6 

0.339 

29.9 

0.224 

14.6 

0.278 

20.3 

0.1 

4 

0.242 

59.5 

0.128 

18.5 

0.135 

19.3 

0.129 

16.8 

0.125 

16.9 

8 

0.389 

70.0 

0.288 

26.3 

0.272 

26.9 

0.265 

17.4 

0.304 

24.1 

0.3 

4 

0.252 

105. 

0.174 

55.1 

0.182 

60.0 

0.167 

51.2 

0.168 

51.4 

8 

0.376 

102. 

0.424 

62.1 

0.417 

64.7 

0.356 

46.0 

0.328 

47.3 
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Fig.  4.  Relative  reduction  of  the  cost  function  JIJ0  for  experiments  with  the  stable  model,  (middle)  Evolution  of  e ^  with  iterations. 


Figures  4  and  5  show  that  4DVAR  demonstrates 
faster  convergence  at  the  initial  stages  of  assimilation 
when  the  number  of  4DVAR  iterations  is  i  <  m. 
However,  the  R4DVAR  catches  up  after  m-4m  itera¬ 
tions,  performs  similarly  for  the  stable  case  (Fig.  4),  and 
outperforms  4DVAR  in  the  case  when  TLA  is  broken 
(Fig.  5).  The  effect  becomes  more  visible  at  higher  noise 
levels  and  sparser  sampling:  in  these  cases  J  has  a  larger 
number  of  local  minima,  and  the  4DVAR  algorithm  in 
the  stable  case  tends  to  terminate  as  soon  as  it  en¬ 
counters  the  first  one  (Figs.  4b,c).  R4DVAR  has  a  ca¬ 
pability  to  search  over  the  surroundings  and  eventually 
find  a  deeper  minimum.  The  sparsely  sampled  unstable 
experiment  with  zero  noise  (Fig.  5c)  provides  an  inter¬ 
esting  example  of  this  property:  the  4DVAR  scheme 
failed  at  the  76th  iteration  because  of  the  loss  of  the  de¬ 
scent  direction,  providing  a  “suboptimal”  initial  condition 
shown  in  the  left  panel  of  Fig.  6.  The  R4DVAR  scheme 
proceeded  further,  and  was  able  to  retrieve  an  anticy- 


clonic  eddy  in  the  northwestern  corner  of  the  domain 
(middle  panel  in  Fig.  6).  Note  that  this  eddy  is  barely 
captured  by  the  observational  grid  only  at  the  last  day  of 
the  model  run  (Fig.  1,  rightmost  panel  in  the  top  row). 

As  it  is  seen  from  Table  2  and  Fig.  5,  the  R4DVAR 
technique  is  especially  advantageous  in  the  unstable 
case,  mostly  because  of  its  robustness  with  respect  to 
dynamical  instabilities.  In  contrast,  the  4DVAR  algo¬ 
rithm  terminated  because  of  the  line  search  failure  in  all 
the  unstable  cases.  At  sparser  sampling  and  higher  noise 
levels  the  termination  occurred  much  earlier,  often  after 
less  than  100  iterations  (Figs.  5b, c). 

Finally,  building  the  Krylov  subspaces  using  model- 
data  projection  Q  and  B  (experiments  EH)  proves  to  be 
advantageous  in  terms  of  convergence  rates,  especially 
at  the  late  stages  of  the  assimilation  process  when  sparse 
and/or  noisy  data  are  assimilated  (Figs.  4c  and  5).  At 
these  stages  prior  statistics  imposed  by  the  smoothness 
constraint  begins  to  play  its  role,  as  the  contribution  of 


Fig.  5.  Error  in  the  approximation  of  the  reference  solution  e ^  for  experiments  with  the  unstable  model. 
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Fig.  6.  Optimized  streamfunction  at  t  =  0  obtained  by  the  (left)  4DVAR  and  (middle)  R4DVAR  methods  for  the  unstable  case  at  sparse 
resolution,  (right)  The  reference  solution  is  shown.  The  Cl  is  3000  m2  s_1. 


the  smoothness  term  to  the  cost  function  becomes 
comparable  with  the  term,  penalizing  misfit  with  the 
real  data.  As  a  consequence,  directions  toward  the 
minimum  of  the  cost  function  obtained  by  the  standard 
EOF  expansion  become  more  and  more  distorted  by  the 
“observed”  zero  values  of  A2*//.  One  may  expect  that  in 
the  case  of  nonlocal  model-data  projection  operators 
the  effect  could  be  visible  at  the  earlier  stages  as  well. 

5.  Discussion  and  conclusions 

In  this  paper,  a  version  of  the  reduced  control  space 
4DVAR  data  assimilation  method  is  proposed.  In  con¬ 
trast  to  previous  studies  (e.g.,  Robert  et  al.  2006;  Daescu 
and  Navon  2007;  Qiu  et  al.  2007;  Liu  et  al.  2008),  which 
utilized  a  fixed  EOF-generated  subspace  for  optimization, 
our  algorithm  employs  a  sequence  of  low-dimensional 
subspaces  that  are  iteratively  updated  in  the  process 
of  finding  a  minimum  of  the  cost  function.  A  similar 
optimization  technique  was  utilized  by  Vermeulen  and 
Heemink  (2006),  Cao  et  al.  (2007),  and  Fang  et  al.  (2009), 
but  their  approach  involved  construction  of  the  reduced 
model  and  its  adjoint,  which  are  not  required  in  our  case. 

The  algorithm  is  tested  in  the  framework  of  twin-data 
assimilation  experiments  with  a  nonlinear  quasigeo- 
strophic  model  controlled  by  the  initial  distribution  of 
potential  vorticity.  Robustness  of  the  method  is  com¬ 
pared  with  the  standard  4DVAR  technique  based  on  the 
adjoint  code.  Our  results  can  be  summarized  as  follows: 

1)  Compared  to  4DVAR,  the  proposed  method  provides 
similar  or  better  reduction  of  the  cost  function  after 
several  updates  of  the  search  subspace.  In  terms  of  the 
computational  cost,  R4DVAR  performs  similarly  with 
4DVAR  if  the  latter  is  terminated  after  more  than 


2m-4m  iterations,  where  m  is  the  (fixed)  dimension  of 
the  reduced  control  subspaces  (Figs.  4  and  5). 

2)  The  proposed  method  gains  substantial  advantage 
over  4DVAR  when  the  dynamical  constraints  have 
strong  nonlinear  instabilities,  which  cause  the  break¬ 
down  of  TLA. 

3)  Compared  to  4DVAR,  the  proposed  method  gains 
extra  efficiency  when  observations  become  more 
sparse  and/or  noisy. 

The  proposed  technique  is  based  on  application  of  the 
Krylov  subspace  method  targeted  on  the  specific  type  of 
cost  functions  encountered  in  4DVAR  problems.  The 
technique  has  a  lot  in  common  with  the  GMRES,„  al¬ 
gorithm  and  appears  to  be  equivalent  to  GMRES  in  the 
limit  m  =  M  under  the  conditions  specified  in  section  2. 
The  distinct  feature  of  our  approach  is  construction  of 
the  low-dimensional  subspaces  not  on  the  powers  of  H, 
but  on  the  approximations  to  the  operators  entering  its 
square  root  in  (6).  Because  these  operators  are  pro¬ 
portional  to  the  powers  of  the  dynamical  operator,  we 
employ  the  EOF  analysis  of  the  model  trajectories  built 
on  the  control  space  residuals  to  extract  the  functions 
spanning  the  low-dimensional  search  spaces  3Cm. 

Similar  to  other  R4DVAR  methods,  the  proposed 
technique  does  not  require  development  and  mainte¬ 
nance  of  the  adjoint  code.  It  is  also  very  efficient  in 
terms  of  parallelization,  since  the  major  portion  of  CPU 
time  is  consumed  by  m-independent  model  runs  required 
for  gradient  computation  in  the  Krylov  subspaces.  Re¬ 
garding  parallelization,  one  may  expect  an  additional 
10%-30%  gain  in  computational  cost  of  the  R4DVAR 
method  when  it  is  applied  to  state-of-the-art  OGCMs, 
whose  parallelization  efficiency  scales  nonline  arly  with 
an  increase  of  the  number  of  processors. 
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Another  potential  CPU  time  gain  in  OGCM  appli¬ 
cations  should  be  acquired  if  we  consider  computational 
efficiency  of  the  adjoint  models.  As  a  rule,  adjoint  codes 
of  the  community  OGCMs  with  4DVAR  capabilities 
(especially  those  generated  by  automatic  compilers) 
require  2-5  times  more  CPU  time  than  the  direct  codes, 
whereas  the  adjoint  of  our  simple  model  is  only  10% 
slower  than  the  direct  one. 

One  particular  advantage  of  the  R4DVAR  approach 
is  that  it  eliminates  the  necessity  to  “stabilize”  the  ad¬ 
joint  model  in  the  presence  of  nonlinear  instabilities 
(e.g.,  Zhu  and  Kamachi  2000;  Zhu  et  al.  2002)  by  sim¬ 
plification  and/or  modification  of  the  numerical  scheme. 
These  approximations  lead  to  a  certain  loss  of  accuracy 
of  the  tangent  linear  approximation  (Fig.  3)  and  degrade 
the  performance  of  descent  algorithms. 

Another  benefit  of  the  method  is  that  it  implicitly 
regularizes  the  problem  through  the  ranking  of  the 
search  subspaces  in  the  course  of  assimilation:  during 
the  first  iterations  the  smoothest  approximations  to 
initial  conditions  are  recovered,  they  are  later  refined  in 
subspaces  containing  higher-order  spatial  harmonics.  In 
fact,  results  of  R4DVAR  assimilation  were  weakly 
sensitive  to  the  magnitude  of  Ws,  especially  in  the  cases 
of  dense  observations  and  moderate  noise  levels. 

Our  numerical  experiments  have  also  shown  that  the 
efficiency  of  the  proposed  technique  is  sensitive  to  the 
quality  of  the  first-guess  subspace  !KQ.  For  instance, 
when  JCQ  is  generated  by  an  ensemble  of  white  noise 
perturbations  as  proposed  by  Qiu  et  al.  (2007),  the  de¬ 
scent  became  inefficient  and  required  an  order  in 
magnitude  increase  either  in  the  subspace  dimension  or 
in  the  number  of  iterations  needed  to  achieve  the  con¬ 
vergence.  This  phenomenon  can  be  explained  by  a  small 
projection  of  the  seed  perturbations  on  the  optimal 
state.  A  similar  effect  has  been  observed  in  the  twin- 
data  experiments  of  Qiu  et  al.  2007,  who  considered  a 
smooth  background  state  of  a  2D  shallow-water  model 
on  the  44  X  44  grid  but  had  to  use  150  ensemble 
members  for  a  reasonably  accurate  assimilation. 

In  that  respect  it  is  necessary  to  note  that  in  the  limiting 
case  of  “zero  quality”  of  the  first-guess  state  (x0  is  H- 
orthogonal  to  b,  section  2d),  the  proposed  algorithm  will 
fail  in  the  linear  case.  This  is  not  necessarily  true  in  the 
nonlinear  case,  because  in  the  process  of  model  inte¬ 
gration  nonlinearities  may  generate  state  vector  compo¬ 
nents  that  are  not  present  in  the  first-guess  solution. 
These  components  affect  the  composition  of  3Cm  and 
may  eventually  span  b  with  iterations.  In  operational 
applications  the  very  bad  quality  of  the  background  state 
seems  to  be  unlikely,  because  a  reasonably  good  ap¬ 
proximation  to  reality  is  already  available  either  from 
previous  assimilation  cycles  or  from  the  preliminary  data 


analysis  (simulated  in  section  3d).  Note  that  in  all  the 
reported  R4DVAR  experiments  we  did  not  use  any  prior 
information  on  the  solution  except  that  contained  in  the 
data  itself  and  in  a  simple  smoothness  constraint. 

Further  improvements  of  the  method  can  be  done  in 
several  directions.  First  of  all,  a  better  approximation  to 
ST  could  be  developed.  In  this  study  we  used  the  back¬ 
ground  error  covariance  to  project  QkQkAkr  on  the  state 
space.  In  many  applications,  however,  the  background 
error  covariance  is  rank  deficient  as  it  is  approximated  by 
n  —50-100  leading  eigenmodes,  emerging  from  statistical 
analysis.  In  such  situations  the  condition  of  Hayami  et  al. 
(2007)  may  be  violated,  causing  inability  of  the  scheme  to 
retrieve  data  components  not  present  in  the  spectrum  of 
B.  More  secure  adjointless  projections  could  be  sug¬ 
gested,  that  involve  replacing  AT  by  A  or  by  its  “non¬ 
linear  approximation.”  The  latter  can  be  obtained,  for 
example,  by  reversing  the  sign  of  the  odd-order  differ¬ 
ential  operators  in  A.  These  projections  may  seem  more 
robust  as  compared  to  the  one  with  low-rank  B  because 
dimensions  of  their  null  spaces  are  much  smaller  than 
M  -  n,  and  their  structure  may  differ  only  marginally 
from  the  null  space  of  AT.  Another  possible  improve¬ 
ment  that  could  be  done  is  augmenting  the  Krylov 
matrices  with  projections  of  the  residuals  on  the  certain 
eigenfunctions  of  B,  if  the  latter  are  readily  available. 

There  is  also  a  room  for  increasing  the  computational 
efficiency  of  the  proposed  algorithm.  One  of  the  direc¬ 
tions  is  applying  more  sophisticated  methods  for 
extracting  the  basis  in  JCm ,  such  as  direct  SVD  de¬ 
composition  of  the  Krylov  matrices.  Another  improve¬ 
ment  could  be  obtained  by  adaptive  adjustment  of  the 
Krylov  space  dimensions.  One  of  the  possible  strategies 
in  that  respect  is  the  entropy  analysis  of  the  Hessian 
spectra  in  3Cm  (Uzunoglu  et  al.  2007). 

The  most  urgent  development,  however,  is  to  test  the 
method  with  the  multivariate  state  vectors  of  a  state- 
of-the-art  OGCM.  This  is  the  subject  of  our  present 
research. 
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